
The Ontario Jewish Archives, Blankenstein Family Heritage Centre (OJA) has recently launched a new website that includes two exciting new features. Andornot has been working with OJA staff for the past seven months to build a sophisticated search interface to their archival records and to create an interactive map of Jewish Landmarks of Ontario.

A wealth of content related to Jewish history in the province of Ontario is federated and now searchable from a single search box. Much of this is publicly available for the first time and includes:

  • close to 25,000 archival descriptions
  • selected archival accessions
  • oral histories and interviews
  • historical landmarks
  • Toronto Jewish city directories
  • ship passenger manifests
  • website and online exhibits
  • images, audio, video, digitized text

The OJA came into the project with a specific vision for their site as well as a set of requirements for searching, sorting and displaying results. Results from all data sources are intermingled and facets may be selected to narrow the results by data source, the collection and description level for descriptive records, format, decade, subject, name, and place. Results can also be limited to records with images or video or other types of digital content.
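To make the idea of facets concrete, here is a minimal Python sketch of counting and narrowing mixed results. It is only an illustration: on the real site this work is done by the search engine, and the records and field names below are hypothetical.

```python
from collections import Counter

# Hypothetical records from different data sources; field names are illustrative.
records = [
    {"source": "Archival descriptions", "decade": "1920s", "has_image": True},
    {"source": "Oral histories", "decade": "1970s", "has_image": False},
    {"source": "Archival descriptions", "decade": "1970s", "has_image": True},
    {"source": "Landmarks", "decade": "1920s", "has_image": True},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given facet field."""
    return Counter(r[field] for r in results)


def narrow(results, **selected):
    """Keep only the results that match every selected facet value."""
    return [r for r in results if all(r.get(k) == v for k, v in selected.items())]


by_source = facet_counts(records, "source")
images_from_1920s = narrow(records, decade="1920s", has_image=True)
```

Each facet selection simply adds another condition, which is why facets can be combined in any order.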

Some of the neat features include:

  • The provenance is indicated with a hierarchical tree to show the context in which descriptive records were created.
  • For website content pages, the search term is highlighted in a snippet on the results page to show context.
  • An Add to a List option allows users to print selected records, create a PDF, or email their search results.
  • Clicking on an image automatically displays an overlay with a dynamically generated, watermarked larger version.
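The provenance tree in the first item above can be pictured as a walk up a chain of parent links. The following Python sketch assumes each descriptive record stores the id of its parent; the ids, titles and field names are hypothetical placeholders, not the OJA's actual data model.

```python
# Hypothetical descriptive records linked by parent id; titles are placeholders.
descriptions = {
    "F1": {"title": "Example family fonds", "level": "Fonds", "parent": None},
    "S1": {"title": "Correspondence series", "level": "Series", "parent": "F1"},
    "I1": {"title": "Letter", "level": "Item", "parent": "S1"},
}


def provenance_path(record_id, records):
    """Walk parent links upward and return the chain from the top level down."""
    path = []
    current = record_id
    while current is not None:
        record = records[current]
        path.append(f"{record['level']}: {record['title']}")
        current = record["parent"]
    return list(reversed(path))


def render_tree(path):
    """Indent each level two spaces deeper than its parent."""
    return "\n".join("  " * depth + label for depth, label in enumerate(path))


tree = render_tree(provenance_path("I1", descriptions))
```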

A really helpful feature when dealing with proper names and places is the "Did you mean" spell-checking functionality. A search for Eglington, for example, will bring up a message suggesting Eglinton instead. Even if users know the right spelling, this is great for catching typos.
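As a rough illustration of how such a suggestion can be produced, the sketch below compares the query against a list of known terms using Python's standard difflib module. This is a swapped-in technique for illustration, not the search engine's own spell checker, and the term list is a hypothetical stand-in for the real index.

```python
import difflib

# Hypothetical vocabulary of indexed terms; a real index would hold many more.
indexed_terms = ["eglinton", "spadina", "kensington", "bathurst"]


def did_you_mean(query, vocabulary, cutoff=0.8):
    """Return the closest indexed term, or None when the query is already
    known or nothing in the vocabulary is similar enough."""
    term = query.lower()
    if term in vocabulary:
        return None
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


suggestion = did_you_mean("Eglington", indexed_terms)
```

The cutoff controls how forgiving the matching is: a lower value suggests more often but risks offering unrelated terms.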

The Jewish Landmarks of Ontario currently includes points of interest in the Kensington Market/Spadina area of Toronto, but will be expanding to include neighbourhoods, towns and cities from around the province. These historical buildings and sites are pinpointed on an interactive map using data in the Landmarks database, and are accompanied by photos, documents, and audiovisual material pulled from the other databases.

The website was designed by Emerson Media and is hosted on the OJA servers.  The search interface is hosted by Andornot and incorporates the same templating and styles for a seamless transition.  Updated records and images are synchronized nightly based on certain criteria, allowing OJA to choose when a record is ready for publication on the website.

OJA has used Inmagic DB/TextWorks software along with the Andornot Archives Starter Kit for many years to manage their accessions and descriptive records.  Their oral histories database was expanded for this project and we worked with the OJA to create a new, linked Landmarks database.

The search interface is built with the Andornot Discovery Interface (AnDI). AnDI is an ASP.NET MVC web application that leverages the open-source Apache Solr search engine. Solr is fast, can handle very large data sets, and has excellent and highly configurable search algorithms and relevancy rankings. AnDI adheres to the Dublin Core Metadata standard, with imported data mapped to fields in the Dublin Core element set. This permits multiple data sources, each with a different schema, to be indexed, searched and presented in a single discovery interface. Some modifications were made to the existing OJA databases to better utilize the search features in AnDI, but apart from this, staff have been able to continue their regular routines without needing to learn any new software.
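The mapping step can be sketched in a few lines of Python. This is an illustration of the general idea only, not AnDI's implementation (AnDI is an ASP.NET application). The source field names and the data_source field are hypothetical; only the Dublin Core element names (title, creator, date) come from the standard.

```python
# Hypothetical source field names mapped to Dublin Core element names.
FIELD_MAPS = {
    "descriptions": {"Title": "title", "Creator": "creator", "Dates": "date"},
    "oral_histories": {
        "Name": "title",
        "Interviewee": "creator",
        "InterviewDate": "date",
    },
}


def to_dublin_core(data_source, record):
    """Rename a source record's fields to Dublin Core elements and tag its origin."""
    mapping = FIELD_MAPS[data_source]
    doc = {dc: record[field] for field, dc in mapping.items() if field in record}
    doc["data_source"] = data_source  # non-DC helper field, useful as a facet
    return doc


description = to_dublin_core("descriptions", {"Title": "Minutes", "Dates": "1934"})
interview = to_dublin_core(
    "oral_histories", {"Name": "Interview 12", "Interviewee": "A. Example"}
)
```

Once every source is expressed in the same element set, one index and one results template can serve them all.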

The landmarks map makes use of Leaflet.js, an open-source JavaScript library for mobile-friendly interactive maps, and the Google Maps API. AnDI's responsive and mobile-friendly UI was built with the Zurb Foundation CSS framework.
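Leaflet can plot points supplied as GeoJSON, so one plausible way to feed a map from a database is to export the records in that format. Whether the OJA map is populated this way is an assumption; the Python sketch below uses placeholder names and coordinates purely for illustration.

```python
import json

# Hypothetical landmark rows; names and coordinates are placeholders.
landmarks = [
    {"name": "Example synagogue", "lat": 43.65, "lon": -79.40},
    {"name": "Example bakery", "lat": 43.66, "lon": -79.41},
]


def to_geojson(rows):
    """Build a GeoJSON FeatureCollection of points.

    GeoJSON orders coordinates as [longitude, latitude]."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [row["lon"], row["lat"]]},
            "properties": {"name": row["name"]},
        }
        for row in rows
    ]
    return {"type": "FeatureCollection", "features": features}


geojson_text = json.dumps(to_geojson(landmarks), indent=2)
```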

As illustrated by this project, AnDI can be applied to search multiple disparate data sources, providing a user-friendly interface whilst allowing the archives to maintain their archivist-oriented internal systems and workflow.

We are delighted with the new site, and the feedback we have received from OJA staff has been incredibly positive:

“I would like to extend our thanks to all of you for your hard work over the last year in helping make our new site a reality. This has been a monumental undertaking for our tiny staff of three. I think the site accomplishes what we first set out to do – engage users with different interests and skill sets and expose the richness of the records that we have been entrusted to safeguard on behalf of the Jewish community of Ontario.

Your professionalism, skills and problem-solving abilities have been of tremendous value to us and we are grateful for the time that you have spent trouble-shooting to make sure that everything works at its best. It has been a pleasure working with you.” [Donna Bernardo-Ceriz, Assistant Archivist]

The College of Registered Nurses of BC (CRNBC) celebrated its 100th anniversary in 2012. To commemorate this milestone, a project was undertaken to digitize and make available online many decades of CRNBC publications, such as newsletters and annual reports. This collection documents the history of the college and the many nurses who contributed to its first 100 years and, perhaps most importantly, makes it easy to track important decisions over the decades.

Printed copies of the publications were digitized by a service bureau, with Andornot then developing the online search and presentation system.

The new site is available at https://archives.crnbc.ca

As shown in the flowchart below, the workflow from print to online involved several stages and processes.

  • The service bureau scanned the documents to specifications developed by Andornot, producing thousands of high-resolution TIFF images – one image for each page of each publication – as well as associated XML in ALTO format containing the full text extracted from the scanned images through an OCR process. 
  • Andornot developed scripts to extract metadata from these many separate files, such as the name and date of the publication, and to generate images in different sizes as needed for the interface. We used PowerShell, ImageMagick and djvulibre for this.
  • Andornot developed a search engine using the Andornot Discovery Interface (AnDI) to provide the best possible keyword searching. 
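For readers unfamiliar with ALTO, the sketch below shows how the text of one page can be pulled out of such a file. It is written in Python for illustration (the project itself used PowerShell), and the sample XML is a hypothetical, heavily trimmed page. In ALTO, each recognized word is a String element whose CONTENT attribute holds the text, grouped into TextLine elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ALTO page used only for this example.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Annual" HPOS="100" VPOS="200" WIDTH="300" HEIGHT="60"/>
      <String CONTENT="Report" HPOS="420" VPOS="200" WIDTH="310" HEIGHT="60"/>
    </TextLine>
    <TextLine>
      <String CONTENT="1952" HPOS="100" VPOS="280" WIDTH="180" HEIGHT="60"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def page_text(alto_xml):
    """Return the page text, one output line per ALTO TextLine."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in element
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return "\n".join(lines)


text = page_text(SAMPLE_ALTO)
```

Matching on the local tag name keeps the sketch independent of which ALTO schema version a service bureau delivers.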

The interface and features were tailored to the specific needs of this project:

  • Brief search results show details of the publication and a snippet of text showing the user’s search words highlighted, as well as a thumbnail image of the page containing the text, and facets to limit by date and publication. 
  • Clicking through to the full record shows the page in greater detail, but still with the search words highlighted. As well, the surrounding pages of the publication are also available allowing quick navigation through the entire publication. This was achieved through the use of the New York Times Document Viewer and custom programming to highlight text in an overlay layer.
  • A PDF of the full document is also available for download. Andornot created these by stitching the separate image files for each page back together into a single file.
  • Permalinks allow users to easily bookmark and share specific pages and documents.
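Highlighting words on a page image relies on knowing where each word sits in the scan. The following Python sketch shows one way the rectangles for an overlay could be computed, assuming word positions in scan pixels are available from the OCR output. The word boxes, field names and sizes are hypothetical, and this is not the project's actual viewer code.

```python
# Hypothetical word boxes in scan pixels, as OCR output might supply them.
words = [
    {"text": "Nursing", "x": 100, "y": 200, "w": 300, "h": 60},
    {"text": "policy,", "x": 420, "y": 200, "w": 240, "h": 60},
]


def highlight_boxes(words, query, scan_width, display_width):
    """Return rectangles, scaled to the displayed image, for words that
    match any of the query terms."""
    scale = display_width / scan_width
    terms = {term.lower() for term in query.split()}
    boxes = []
    for word in words:
        # Ignore trailing punctuation and case when comparing.
        if word["text"].lower().strip(".,;:!?") in terms:
            boxes.append({key: round(word[key] * scale) for key in ("x", "y", "w", "h")})
    return boxes


boxes = highlight_boxes(words, "policy", scan_width=2000, display_width=1000)
```

Because the rectangles are scaled from the scan size, the same OCR coordinates can serve thumbnails and full-size views alike.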

Often in a digitization project, the result might be a single PDF per publication. With this project, by having each page available as a separate image, we were more easily able to direct the user to the page and text they are most interested in, though they can still access a PDF of the entire document – the best of both worlds.

All of this complexity comes together to provide an elegant and intuitive interface for users.

A CRNBC staffer using the archive says, “This archive is awesome! We were able to search several decades of a policy issue in a short time, so we could draft an historical timeline showing policy changes right up to 2013! Searching this database saved us so much time.”

Contact Andornot for help with your own digitization project.

Version 1.4 of VuFind, a leading open-source discovery interface, has just been released. Andornot recommends VuFind as a great tool for integrating content from library catalogues and other databases and making it searchable with features and tools users expect from a web interface, such as:

  • spelling corrections and "did you mean?" suggestions of alternate terms and records;
  • facets, such as subject and author, to quickly refine a search;
  • options to save, email, bookmark and share searches and results; and
  • integrated content from external sources, such as book covers, reviews, and author biographies.

With version 1.4, a new "collections" feature makes VuFind an excellent choice for a discovery interface for archival and similar collections where information is arranged hierarchically (e.g. fonds, series, item, etc.).

These are two excellent examples of hierarchical collections within a VuFind system:

A more general VuFind demo is available at http://vufind.org/demo

More information about VuFind is available from this page and the VuFind website.

Contact Andornot to discuss how a discovery interface can provide the best search experience and features for your users, no matter what type of information it includes.

 

Two articles written by Jonathan Jacobsen have now been published in the UK online publication FreePint.

The abstract of the subscription-only, full-length article is available here:

http://web.freepint.com/go/sub/article/69688

A shorter, free-access version, “Power Through Disparate Datasets with Discovery Interfaces”, is available at:

http://web.freepint.com/go/features/69689

Please contact us if you would like to discuss the topics covered and how we can help you implement these ideas in your organization!

The Canadian Conservation Institute (CCI) in Ottawa, a long-time Andornot client, required a new version of their bilingual online catalogue and staff bibliography that would pass the strict requirements of W3C’s Web Content Accessibility Guidelines (WCAG). Andornot helped CCI turn the requirement into an opportunity to add new features, including facets, multi-database search, spelling suggestions, and faster search performance.

The CCI Library has one of the largest conservation and museology collections in the world. The collections are regarded as an important source for conservation and museology literature on a wide variety of topics, such as preventive conservation, industrial collections, architectural heritage, fire and safety protection, museum planning, archaeological conservation, preservation in storage and display, exhibition design, disaster preparedness, and museum education. The holdings include a large selection of books on textiles, furniture, paintings, sculptures, prints and drawings, and archaeological and ethnological objects.

-- "CCI Library". Canadian Conservation Institute. Retrieved 4 July 2012.

The upgraded website uses the Andornot Discovery Interface (AnDI for short), a modern and highly configurable web application that tempers cutting-edge open source search technology with many years of Andornot experience in search-focused design.

It was possible to meet WCAG compliance because AnDI provides complete control over every HTML tag and CSS statement. The HTML5 structure presents a clean cross-browser template that reads well on mobile devices and has backwards-compatible support for older browsers.

The CCI Library's French and English versions were created with AnDI's built-in multilingual support, and are triggered through the presence of "en" or "fr" in the URL. Moving from one to another is a smooth transition: a user can switch the page language at any time without interrupting their experience or being redirected to a start page. Even errors and page-not-found messages are bilingual.
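A simple way to picture this kind of language switching is sketched below in Python. It assumes the language code is the first segment of the URL path, which the post does not state, and it is not AnDI's actual routing code; the example.org URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

LANGUAGES = ("en", "fr")


def page_language(url, default="en"):
    """Return the language code found in the first path segment, if any."""
    segments = urlsplit(url).path.strip("/").split("/")
    return segments[0] if segments[0] in LANGUAGES else default


def switch_language(url, target):
    """Swap the language segment while keeping the rest of the URL intact."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if segments[0] in LANGUAGES:
        segments[0] = target
    else:
        segments.insert(0, target)
    return urlunsplit(parts._replace(path="/" + "/".join(segments)))


french = switch_language("https://example.org/en/search?q=textiles&page=2", "fr")
```

Because only the language segment changes, the query and page number survive the switch, which is what lets a user change language without losing their place.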

Facets and spelling suggestions (and many other features) are made possible by AnDI's open source search technology: Apache Solr. Solr is blazing fast, optimized for full-text search on the web, and relied on by some of the biggest names on the internet.

Every page is bookmarkable because the URL always holds the information needed to reconstruct the page. This makes the site friendly to permalinks and Search Engine Optimization (SEO).
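The principle can be sketched in Python: put everything needed to rebuild a results page into the query string, then read it back out. The parameter names (q, page, and an "f." prefix for facets) are hypothetical and are not AnDI's actual URL scheme.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(base, query, facets=None, page=1):
    """Encode the full search state into the URL's query string."""
    params = [("q", query), ("page", page)]
    for field, value in sorted((facets or {}).items()):
        params.append((f"f.{field}", value))
    return f"{base}?{urlencode(params)}"


def restore_state(url):
    """Rebuild the search state from nothing but the URL."""
    parsed = parse_qs(urlsplit(url).query)
    facets = {key[2:]: values[0] for key, values in parsed.items() if key.startswith("f.")}
    return {"query": parsed["q"][0], "page": int(parsed["page"][0]), "facets": facets}


url = search_url("https://example.org/en/search", "fire safety", {"subject": "Museums"}, page=2)
state = restore_state(url)
```

Since no state lives in a server session, a bookmarked or shared link reproduces the same page for anyone who opens it.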

The CCI Library retains its catalogue and staff bibliography collections in separate Inmagic DB/TextWorks databases that staff continue to update through the familiar desktop interface. Updates are extracted and indexed by Solr automatically on a regular basis via Andornot's Data Extraction Utility (internally we nickname it 'Extract-o-matic'), run from a PowerShell script. The index schema is a Dublin Core-derived metadata element set that Andornot helped to map to both collections.

 

AnDI can be configured to reflect any field set from any data store or database, as well as rich documents such as PDF and Word, images with EXIF metadata, etc. Contact Andornot about AnDI for your own collection.
